High Performance Spatial Data Mining: Scalable Methods for Spatial Autoregression

A thesis submitted to the Faculty of the Graduate School of the University of Minnesota

Authors

  • Baris Mustafa Kazar
  • David J. Lilja
  • Shashi Shekhar
Abstract

Explosive growth in the size of spatial databases has highlighted the need for spatial data mining techniques that can uncover interesting but implicit spatial patterns within these large databases. This thesis deals with reducing the computational complexity of the exact and approximate spatial autoregression (SAR) model solutions. Estimating the parameters of the SAR model using Maximum Likelihood (ML) theory is computationally very expensive because the log-likelihood function requires the logarithm of the determinant (log-det) of a large matrix. The first part of the thesis introduces theory on SAR model solutions. The second part applies parallel processing techniques to the exact SAR model solutions. We propose parallel formulations of the SAR model parameter estimation procedure based on ML theory, using data parallelism with load-balancing techniques. Although this parallel implementation scales up to eight processors, the exact SAR model solution still suffers from high computational complexity and memory requirements. These limitations led us to investigate serial and parallel approximate solutions for SAR model parameter estimation. In the third part of the thesis, we present two candidate approximate semi-sparse solutions of the SAR model, based on Taylor series expansion and Chebyshev polynomials. We show that the differences between exact and approximate SAR parameter estimates have no significant effect on the prediction accuracy. In the next part of the thesis, we develop a new ML-based approximate SAR model solution and its variants, called the Gauss-Lanczos approximated SAR model solution. We algebraically rank the error of the Chebyshev polynomial approximation, the Taylor series approximation, and the Gauss-Lanczos approximation to the solution of the SAR model and its variants. In other words, we establish a novel relationship between the error in the log-det term, which is the approximated term in the concentrated log-likelihood function, and the error in estimating the SAR parameter ρ for all of the approximate SAR model solutions.
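To make the log-det bottleneck concrete, the following is a minimal Python sketch, not the thesis's implementation. It assumes the standard SAR form y = ρWy + Xβ + ε, in which the expensive term is ln|I − ρW|, and compares an exact dense computation with a truncated Taylor series, ln|I − ρW| = −Σ_{k≥1} ρ^k tr(W^k)/k, which converges when the spectral radius of ρW is below 1. The ring-shaped neighborhood matrix and all function names are hypothetical, chosen only for illustration.

```python
import numpy as np


def row_normalized_ring(n):
    """Hypothetical neighborhood matrix W (an assumption for this sketch):
    each site has two neighbors on a ring, and each row sums to 1."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, (i - 1) % n] = 0.5
        W[i, (i + 1) % n] = 0.5
    return W


def exact_logdet(rho, W):
    """Exact ln|I - rho*W| via a dense factorization (cubic cost in n)."""
    n = W.shape[0]
    _sign, logdet = np.linalg.slogdet(np.eye(n) - rho * W)
    return logdet


def taylor_logdet(rho, W, terms=30):
    """Truncated Taylor series: -sum_{k=1..terms} rho^k * tr(W^k) / k.
    Dense matrix powers are used here for clarity only; a scalable
    version would exploit the sparsity of W."""
    total = 0.0
    Wk = np.eye(W.shape[0])
    for k in range(1, terms + 1):
        Wk = Wk @ W  # Wk now holds W^k
        total -= (rho ** k) * np.trace(Wk) / k
    return total


if __name__ == "__main__":
    W = row_normalized_ring(50)
    rho = 0.5
    exact = exact_logdet(rho, W)
    for terms in (2, 5, 10, 30):
        approx = taylor_logdet(rho, W, terms=terms)
        print(f"terms={terms:2d}  abs error={abs(exact - approx):.2e}")
```

Running the script prints the absolute error of the truncated series for an increasing number of terms, which gives a small-scale feel for how the error in the log-det term depends on the approximation order.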




Publication date: 2005